Statistical Analysis of Some Multi-Category Large Margin Classification Methods

Author

  • Tong Zhang
Abstract

The purpose of this paper is to investigate statistical properties of risk minimization based multicategory classification methods. These methods can be considered as natural extensions of binary large margin classification. We establish conditions that guarantee the consistency of classifiers obtained in the risk minimization framework with respect to the classification error. Examples are provided for four specific forms of the general formulation, which extend a number of known methods. Using these examples, we show that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem. Such conditional probability information can be useful for statistical inferencing tasks beyond classification.

1. Motivation

Consider a binary classification problem where we want to predict label y ∈ {±1} based on observation x. One of the most significant achievements for binary classification in machine learning is the invention of large margin methods, which include support vector machines and boosting algorithms. Based on a set of training samples (X_1, Y_1), ..., (X_n, Y_n), a large margin binary classification algorithm produces a decision function f̂(·) by minimizing an empirical loss function that is often a convex upper bound of the binary classification error function. Given f̂(·), the binary decision rule is to predict y = 1 if f̂(x) ≥ 0, and to predict y = −1 otherwise (the decision rule at f̂(x) = 0 is not important). In the literature, the following form of large margin binary classification is often encountered: we minimize the empirical risk associated with a convex function φ in a pre-chosen function class C_n that may depend on the sample size:

$$\hat{f}(\cdot) = \arg\min_{f(\cdot) \in C_n} \frac{1}{n} \sum_{i=1}^{n} \phi\big(f(X_i)\, Y_i\big). \tag{1}$$

Originally such a scheme was regarded as a compromise to avoid computational difficulties associated with direct classification error minimization, which often leads to an NP-hard problem.
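As a rough illustration of formulation (1), the sketch below minimizes the empirical φ-risk by plain gradient descent. Everything specific in it is an assumption made for the example and is not taken from the paper: the logistic surrogate φ(v) = ln(1 + e^(−v)), a one-dimensional linear function class f(x) = w·x + b standing in for C_n, the synthetic Gaussian data, and the helper names `make_toy_data`, `fit`, `predict`, and `prob_positive`. The last helper uses the standard fact that, for the logistic surrogate, the population risk minimizer is the log-odds of the positive class, so a sigmoid of f̂(x) can be read as a conditional probability estimate.

```python
import math
import random


def phi(v):
    """Logistic surrogate phi(v) = ln(1 + exp(-v)), evaluated stably."""
    if v >= 0:
        return math.log1p(math.exp(-v))
    return -v + math.log1p(math.exp(v))


def phi_grad(v):
    """Derivative phi'(v) = -1 / (1 + exp(v)), evaluated stably."""
    if v >= 0:
        e = math.exp(-v)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(v))


def make_toy_data(n, seed=0):
    """Hypothetical data: label y in {-1, +1}, observation x ~ N(y, 1)."""
    rng = random.Random(seed)
    ys = [rng.choice([-1, 1]) for _ in range(n)]
    xs = [rng.gauss(y, 1.0) for y in ys]
    return xs, ys


def empirical_risk(w, b, xs, ys):
    """(1/n) * sum_i phi(f(X_i) * Y_i) for the linear f(x) = w*x + b."""
    return sum(phi((w * x + b) * y) for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, steps=500, lr=0.5):
    """Gradient descent on the empirical risk over (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            g = phi_grad((w * x + b) * y) * y  # chain rule through f(x) * y
            gw += g * x
            gb += g
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def predict(w, b, x):
    """Decision rule: predict +1 if f(x) >= 0, otherwise -1."""
    return 1 if w * x + b >= 0 else -1


def prob_positive(w, b, x):
    """Sigmoid of f(x); an estimate of P(y = 1 | x) under the logistic surrogate."""
    f = w * x + b
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)


if __name__ == "__main__":
    xs, ys = make_toy_data(200, seed=0)
    w, b = fit(xs, ys)
    print("fitted (w, b):", (w, b))
    print("empirical risk:", empirical_risk(w, b, xs, ys))
    print("prediction at x = 0.3:", predict(w, b, 0.3))
    print("estimated P(y = 1 | x = 0.3):", prob_positive(w, b, 0.3))
```

Swapping in another convex surrogate, such as the hinge loss, only requires replacing `phi` and `phi_grad` (with a subgradient where the loss is not differentiable); the probability reading in `prob_positive` is specific to the logistic choice.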
Some recent works in the statistical literature argued that such methods could be used to obtain conditional probability estimates. For example, see Friedman et al. (2000), Lin (2002), Schapire and Singer (1999), Zhang (2004), Steinwart (2003) for related studies. This point of view allows people to show the consistency of various large margin methods: that is, in the large sample limit, the obtained classifiers achieve the optimal Bayes error rate. For example, see Bartlett et al. (2003), Jiang (2004), Lugosi and Vayatis (2004), Mannor et al. (2003), Steinwart (2002, 2004), Zhang

Similar resources

An Infinity-sample Theory for Multi-category Large Margin Classification

The purpose of this paper is to investigate infinity-sample properties of risk minimization based multi-category classification methods. These methods can be considered as natural extensions to binary large margin classification. We establish conditions that guarantee the infinity-sample consistency of classifiers obtained in the risk minimization framework. Examples are provided for two specif...

Providing a New Model to Improving DEA-based Models in Multi-criteria Inventory Classification (Case Study: Pars Khazar)

Abstract Objective: Many organizations use the ABC classification method to control their large inventories. The most common way to classify inventories is the ABC method. In traditional ABC classification, items are classified according to only one criterion. However, other criteria also need to be considered in inventory classification. The purpose of this study is to prese...

VC Theory of Large Margin Multi-Category Classifiers

In the context of discriminant analysis, Vapnik’s statistical learning theory has mainly been developed in three directions: the computation of dichotomies with binary-valued functions, the computation of dichotomies with real-valued functions, and the computation of polytomies with functions taking their values in finite sets, typically the set of categories itself. The case of classes of vect...

Lp-norm Sauer-Shelah lemma for margin multi-category classifiers

In the framework of agnostic learning, one of the main open problems of the theory of multi-category pattern classification is the characterization of the way the complexity varies with the number C of categories. More precisely, if the classifier is characterized only through minimal learnability hypotheses, then the optimal dependency on C that an upper bound on the probability of error shoul...

Multi-Group Classification Using Interval Linear Programming

Among the various statistical and data mining discriminant analysis methods proposed so far for group classification, linear programming discriminant analysis has recently attracted researchers' interest. This study evaluates multi-group discriminant linear programming (MDLP) for classification problems against well-known methods such as neural networks and support vector machines. MDLP is less compli...

Journal title:
  • Journal of Machine Learning Research

Volume 5

Publication date 2004